A deep dive into advanced evaluation for data scientists, discussing why accuracy is often misleading and exploring alternative metrics for classification and regression, such as ROC-AUC, Log Loss, R², RMSLE, and Quantile Loss.
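A minimal sketch of the point the article makes, not taken from it: on an imbalanced dataset, accuracy can look strong while probability-based metrics such as ROC-AUC and log loss are more informative. The dataset and model here are illustrative stand-ins.

```python
# Illustrative only: a 95/5 imbalanced binary problem where accuracy alone
# would overstate how well the classifier separates the classes.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, log_loss, roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
proba = model.predict_proba(X_te)[:, 1]  # predicted P(class 1)

acc = accuracy_score(y_te, model.predict(X_te))
auc = roc_auc_score(y_te, proba)   # ranking quality, insensitive to threshold
ll = log_loss(y_te, proba)         # penalizes confident wrong probabilities
print(f"accuracy={acc:.3f}  roc_auc={auc:.3f}  log_loss={ll:.3f}")
```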
The article discusses using Large Language Model (LLM) embeddings as features in traditional machine learning models built with scikit-learn. It covers the process of generating embeddings from text data using models like Sentence Transformers, and how these embeddings can be combined with existing features to improve model performance. It details practical steps including loading data, creating embeddings, and integrating them into a scikit-learn pipeline for tasks like classification.
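The combination step the summary describes can be sketched as follows. The `embed` function below is a deterministic dummy standing in for a real encoder such as `SentenceTransformer("all-MiniLM-L6-v2").encode`; the texts, tabular features, and labels are invented for illustration.

```python
# Sketch: concatenate text embeddings with existing tabular features,
# then fit a standard scikit-learn pipeline on the combined matrix.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def embed(texts, dim=32):
    # Stand-in for an LLM embedding model: pseudo-random but
    # reproducible per text within a run, shape (n_texts, dim).
    return np.stack([
        np.random.default_rng(abs(hash(t)) % 2**32).normal(size=dim)
        for t in texts
    ])

texts = ["great product", "terrible service", "okay experience", "loved it"]
rng = np.random.default_rng(0)
tabular = rng.normal(size=(len(texts), 3))  # pre-existing numeric features
y = np.array([1, 0, 0, 1])

X = np.hstack([embed(texts), tabular])      # embeddings + features side by side
clf = make_pipeline(StandardScaler(), LogisticRegression()).fit(X, y)
print(X.shape)
```

Swapping the dummy `embed` for a Sentence Transformers encoder leaves the rest of the pipeline unchanged, which is the main appeal of this approach.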
This page details the topic namers available in Turftopic, allowing automated assignment of human-readable names to topics. It covers Large Language Models (local and OpenAI), N-gram patterns, and provides API references for the `TopicNamer`, `LLMTopicNamer`, `OpenAITopicNamer`, and `NgramTopicNamer` classes.
Python tutorial for reproducible labeling of cutting-edge topic models with GPT-4o mini. The article details training a FASTopic model and labeling its results using GPT-4o mini, emphasizing reproducibility and control over the labeling process.
Multi-class zero-shot embedding classification and error checking. This project improves zero-shot image/text classification using a novel dimensionality reduction technique and pairwise comparison, resulting in increased agreement between text and image classifications.
This article demonstrates how to use the attention mechanism in a time series classification framework, specifically for classifying normal sine waves versus 'modified' (flattened) sine waves. It details the data generation, model implementation (using a bidirectional LSTM with attention), and results, achieving high accuracy.
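The core of the approach described above can be sketched without a deep learning framework: attention pooling turns a sequence of per-timestep hidden states (the bidirectional LSTM's outputs in the article) into a single fixed-size vector. The array shapes and the random scoring vector here are illustrative assumptions.

```python
# NumPy sketch of attention pooling over time steps. In the article this
# sits on top of bidirectional LSTM outputs; here `hidden` is random data.
import numpy as np

rng = np.random.default_rng(0)
T, H = 50, 16                       # time steps, hidden size
hidden = rng.normal(size=(T, H))    # stand-in for per-step LSTM outputs

w = rng.normal(size=H)              # scoring vector (learned in practice)
scores = hidden @ w                 # one relevance score per time step
weights = np.exp(scores - scores.max())
weights /= weights.sum()            # softmax: attention weights sum to 1

context = weights @ hidden          # weighted sum -> fixed-size representation
print(context.shape)
```

The `weights` vector is also what makes the model interpretable: it shows which time steps (e.g. the flattened region of a modified sine wave) drove the classification.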
This article discusses the use of variational autoencoders (VAEs) to generate synthetic data as a solution to the impending data scarcity for training large language models. It explores how synthetic data can address issues like imbalanced datasets, particularly using the UCI Adult dataset, by generating synthetic samples to balance the dataset and improve classification accuracy.
This article provides a comprehensive guide on the basics of BERT (Bidirectional Encoder Representations from Transformers) models. It covers the architecture, use cases, and practical implementations, helping readers understand how to leverage BERT for natural language processing tasks.
This article provides a hands-on guide to classifying human activity using sensor data and machine learning. It covers preparing data, creating a feature extraction pipeline using TSFresh, training a machine learning classifier with scikit-learn, and validating the model using the Data Studio.
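A hand-rolled sketch of that pipeline shape, with a few summary statistics standing in for the hundreds of features TSFresh's `extract_features` would compute automatically. The synthetic "sensor windows" and activity labels are invented for illustration.

```python
# Sketch: windowed sensor signals -> per-window features -> classifier.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
windows, labels = [], []
for i in range(100):
    active = i % 2                                        # 0 = rest, 1 = active
    sig = rng.normal(scale=1.0 + 2.0 * active, size=128)  # one sensor window
    windows.append(sig)
    labels.append(active)

# Feature extraction: simple per-window statistics (TSFresh stand-in).
feats = pd.DataFrame({
    "mean": [w.mean() for w in windows],
    "std": [w.std() for w in windows],
    "ptp": [np.ptp(w) for w in windows],
})

scores = cross_val_score(RandomForestClassifier(random_state=0),
                         feats, labels, cv=5)
print(round(scores.mean(), 2))
```

Cross-validation here plays the role the article assigns to model validation; the feature table has the same windows-as-rows layout TSFresh produces.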
A detailed guide on creating a text classification model with Hugging Face's transformer models, including setup, training, and evaluation steps.